From Anchor to ROI

layer area

From layer $i$ to layer $i+1$, let the parameters of layer $i$ be $s_i$ (stride), $p_i$ (padding), and $k_i$ (kernel size), and let $r_i$ be the width or height of layer $i$. Then, by the standard convolution output-size formula (assuming the division is exact),

$r_{i+1} = (r_i+2p_i-k_i)/s_i+1.$

In the reverse direction, $r_i = s_i r_{i+1}-s_i-2p_i+k_i$, or $r_i = s_i r_{i+1}-s_i+k_i$ if the padded area is counted as part of layer $i$.
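The two size formulas above can be checked numerically with a small sketch. The function names `forward_size` and `backward_size` are hypothetical, and the floor division is an assumption that matches how most frameworks round when the stride does not divide evenly; the formula above assumes exact division.

```python
def forward_size(r, k, s, p):
    """Size of layer i+1 from layer i: (r + 2p - k) / s + 1.

    Floor division is an assumption; it equals the exact formula
    whenever (r + 2p - k) is a multiple of s.
    """
    return (r + 2 * p - k) // s + 1


def backward_size(r_next, k, s, p, include_padding=False):
    """Size of layer i from layer i+1: s * r_next - s + k, minus 2p
    unless the padded area is counted."""
    size = s * r_next - s + k
    return size if include_padding else size - 2 * p


# Illustrative values: a 3-wide kernel, stride 2, padding 1 on a width of 7.
r_next = forward_size(7, k=3, s=2, p=1)
r_back = backward_size(r_next, k=3, s=2, p=1)
r_back_padded = backward_size(r_next, k=3, s=2, p=1, include_padding=True)
```

Here the round trip recovers the original width because the division is exact; when it is not, `backward_size` returns the smallest input size that produces `r_next`.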

coordinate map

Now consider the correspondence between the point $x_i$ on the ROI and the point $x_{i+1}$ on the feature map, which can be reduced to the layer area problem above. In particular, the region spanned by the top-left corner and $x_{i+1}$ on the feature map has as its receptive field the region spanned by the top-left corner and $x_i$ on the ROI. Applying the reverse formula from the layer area problem (the only differences are that we include only the left and top padding, and subtract the kernel radius $(k_i-1)/2$) gives

$x_i=s_i x_{i+1}-s_i-p_i+k_i-(k_i-1)/2.$

The above coordinate system starts from 1. When the coordinate system starts from 0,

$x_i+1=s_i (x_{i+1}+1)-s_i-p_i+k_i-(k_i-1)/2,$

which can be simplified as

$x_i=s_i x_{i+1}+(\frac{k_i-1}{2}-p_i).$

When $p_i=\lfloor k_i/2 \rfloor$, $x_i \approx s_i x_{i+1}$, which is the simplest case: the relation is exact for odd $k_i$ and off by $-1/2$ for even $k_i$.
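As a quick check of the zero-based formula $x_i=s_i x_{i+1}+(\frac{k_i-1}{2}-p_i)$, here is a minimal sketch. The function name `map_back` and the sample layer parameters are illustrative assumptions, not taken from any particular network.

```python
def map_back(x_next, k, s, p):
    """Map a zero-based coordinate on layer i+1 back to layer i.

    Implements x_i = s * x_{i+1} + ((k - 1) / 2 - p).
    The result may be a half-integer when k is even.
    """
    return s * x_next + (k - 1) / 2 - p


# Odd kernel with p = floor(k / 2): the offset vanishes, so x_i = s * x_{i+1}.
odd_case = map_back(3, k=3, s=2, p=1)

# Even kernel with p = floor(k / 2): the offset is -1/2.
even_case = map_back(3, k=2, s=2, p=1)

# No padding: the first output of a 3-wide kernel is centred on input index 1.
unpadded_case = map_back(0, k=3, s=1, p=0)
```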

By applying $x_i=s_i x_{i+1}+(\frac{k_i-1}{2}-p_i)$ recursively, we obtain the general solution

$x_1 = \alpha_L x_{L}+\beta_L,$

in which $\alpha_L = \prod_{l=1}^{L-1} s_l$ and $\beta_L=\sum_{l=1}^{L-1} \left(\prod_{n=1}^{l-1} s_n\right)\left(\frac{k_l-1}{2}-p_l\right)$. Here layer 1 is the original image and layer $L$ is the feature map.
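The coefficients $\alpha_L$ and $\beta_L$ can be accumulated in a single pass over the layers. The sketch below assumes the layers are given as `(k, s, p)` tuples ordered from the image side to the feature-map side; the function name `affine_coefficients` and the four-layer stack are hypothetical examples.

```python
def affine_coefficients(layers):
    """Return (alpha, beta) such that x_1 = alpha * x_L + beta.

    `layers` is a list of (k, s, p) tuples for layers 1..L-1,
    ordered from the input image towards the feature map.
    """
    alpha, beta = 1, 0.0
    for k, s, p in layers:
        # At this point alpha is the product of the strides of all earlier layers.
        beta += alpha * ((k - 1) / 2 - p)
        alpha *= s
    return alpha, beta


# Hypothetical stack: conv 3x3 (pad 1), pool 2x2, conv 3x3 (pad 1), pool 2x2.
layers = [(3, 1, 1), (2, 2, 0), (3, 1, 1), (2, 2, 0)]
alpha, beta = affine_coefficients(layers)

# Cross-check against the layer-by-layer recursion, starting from x_L = 5.
x = 5.0
for k, s, p in reversed(layers):
    x = s * x + (k - 1) / 2 - p
```

After the loop, `x` agrees with `alpha * 5 + beta`, which is the closed form applied to the same point.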

anchor box to ROI

Given the two corner points of an anchor box on the feature map, we can find their corresponding points on the original image; these two points determine the ROI.
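A minimal sketch of this last step follows, assuming zero-based coordinates and the same layer parameters in the horizontal and vertical directions, so that a single pair $(\alpha_L, \beta_L)$ applies to every coordinate. The function name `anchor_to_roi` and the numeric values are illustrative only.

```python
def anchor_to_roi(box, alpha, beta):
    """Map an anchor box on the feature map to an ROI on the image.

    `box` is (x_min, y_min, x_max, y_max) in zero-based feature-map
    coordinates. Each corner coordinate is mapped with
    x_1 = alpha * x_L + beta; using one (alpha, beta) for both axes
    is an assumption of this sketch.
    """
    return tuple(alpha * c + beta for c in box)


# Illustrative values: total stride 16 and offset 7.5.
roi = anchor_to_roi((1, 2, 3, 4), alpha=16, beta=7.5)
```

In practice the resulting ROI would usually also be clipped to the image bounds, which this sketch leaves out.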